A major issue in educational research involves taking into consideration the 
multilevel nature of the data. Since the late 1980s, attempts have been made to 
model social science data that conform to a nested structure. Among other 
models, two-level structural equation modelling or two-level path modelling and 
hierarchical linear modelling are two of the techniques that are commonly 
employed in analysing multilevel data. Despite their advantages, the two-level 
path models do not include the estimation of cross-level interaction effects and 
hierarchical linear models are not designed to take into consideration the 
indirect effects. In addition, hierarchical linear models might also suffer from 
multicollinearity that exists among the predictor variables. This paper seeks to 
investigate other possible models, namely the use of latent constructs, indirect 
paths, random slopes and random intercepts in a hierarchical model. 

Multilevel data analysis, suppressor variables, multilevel mixture modelling, 
hierarchical linear modelling, two-level path modelling 


INTRODUCTION 

In social and behavioural science research, data structures are commonly hierarchical in 
nature, where there are variables describing individuals at one level of observation and 
groups or social organisations at one or more higher levels of observation. In educational 
research, for example, it is interesting to examine the effects of characteristics of the school, 
the teacher, and the teaching as well as student characteristics on the learning or development 
of individual students. However, students are nested within classrooms and classrooms are 
nested within schools, so the data structure is inevitably hierarchical or nested. 

Hierarchical data structures are exceedingly difficult to analyse properly and as yet there does 
not exist a fully developed method for how to analyse such data with structural equation 
modelling techniques (Hox, 1994, as cited in Gustafsson and Stahl, 1999). Furthermore, 
Gustafsson and Stahl (1999) mentioned that there are also problems in the identification of 
appropriate models for combining data to form meaningful and consistent composite 
measures for the variables under consideration. 

Two commonly used approaches in modelling multilevel data are two-level structural 
equation modelling or two-level path modelling and hierarchical linear modelling. Despite 
their advantages, the two-level path models currently employed do not include the estimation 
of cross-level interaction effects; and hierarchical linear models are not designed to take into 
consideration the latent constructs as well as the indirect paths. In addition, some other 
problems are associated with the use of HLM, such as fixed X-variables with no errors of 
measurement and limited modelling possibilities; and, like any regression analysis, HLM also 
suffers from the multicollinearity that exists among the predictor variables. The 
multicollinearity issue is considered in the following section because discussion of this issue 
is not only highly relevant, but is also rarely undertaken. 

MULTICOLLINEARITY AND SUPPRESSOR VARIABLE 

The concept of the ‘suppressor variable’ was introduced by Horst (1941), but it has 
received only passing attention in the nearly two-thirds of a century since it was first 
raised. In its classical rendering, Conger (1974) argued that a suppressor variable was a 
predictor variable that had a zero (or close to zero) correlation with the criterion, but 
nevertheless contributed to the predictive validity of a test. 

Three types of suppressor variables have been identified. Conger (1974) labelled them as 
traditional, negative and reciprocal. Cohen and Cohen (1975) named the same categories 
classical, net, and cooperative. To describe these three types of suppression, suppose that 
there are a criterion variable $Y$ and two predictor variables, $X_1$ and $X_2$. 


Classical Suppression 

Classical suppression occurs when a predictor variable has a zero correlation with the 
criterion but is highly correlated with another predictor in the regression equation. In other 
words, $r_{Y1} \neq 0$, $r_{Y2} = 0$, and $r_{12} \neq 0$. In order to understand the meaning of 
these coefficients it is useful to consider the Venn diagram shown in Figure 1. 



Figure 1. A Venn diagram for classical suppression 

Here the presence of $X_2$ increases the multiple correlation ($R^2$), even though it is not 
correlated with $Y$. What happens is that $X_2$ suppresses some of what would otherwise be error 
variance in $X_1$. 


Cohen et al. (2003, p. 70) gave the formula for the multiple correlation coefficient for two 
predictors and one criterion as a function of their correlation coefficients: 

$$R^2_{Y.12} = \frac{r_{Y1}^2 + r_{Y2}^2 - 2 r_{Y1} r_{Y2} r_{12}}{1 - r_{12}^2} \qquad (3)$$

Since $r_{Y2} = 0$, Equation (3) can be simplified as 

$$R^2_{Y.12} = \frac{r_{Y1}^2}{1 - r_{12}^2} \qquad (4)$$

Because $r_{12}^2$ must be greater than 0, the denominator is less than 1.0. That means that 
$R^2_{Y.12}$ must be greater than $r_{Y1}^2$. In other words, even though $X_2$ is not correlated 
with $Y$, having it in the equation raises the $R^2$ from what it would have been with just 
$X_1$. The general idea is that there is some kind of noise (error) in $X_1$ that is not 
correlated with $Y$, but is correlated with $X_2$. By including $X_2$ this noise is suppressed 
(accounted for), leaving $X_1$ as an improved predictor of $Y$. The magnitude of $R^2_{Y.12}$ 
depends on the values of $r_{12}$ and $r_{Y1}$, as can be seen in Figure 2, where the squared 
multiple correlation ($R^2_{Y.12}$) is presented for different values of $r_{12}$ and for 
different correlations between $X_1$ and $Y$. In some cases the computed value of $R^2_{Y.12}$ 
can be greater than 1; this arises only when $r_{Y1}^2 + r_{12}^2 > 1$, a combination of 
correlations that cannot occur in real data when $r_{Y2} = 0$. 
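The behaviour of Equation (3) under classical suppression can be checked numerically. The following is a minimal sketch in Python; the function names and the correlation values are illustrative assumptions and do not come from the data analysed in this article. The admissibility check uses the determinant of the 3 x 3 correlation matrix, which must be non-negative for the three correlations to occur together.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of Y on X1 and X2, as in Equation (3)."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def is_admissible(r_y1, r_y2, r_12):
    """Check that the three correlations can occur together.

    The determinant of the 3x3 correlation matrix must be non-negative.
    """
    det = 1 - r_y1**2 - r_y2**2 - r_12**2 + 2 * r_y1 * r_y2 * r_12
    return det >= 0


# Hypothetical values: X2 is uncorrelated with Y but correlated with X1.
r_y1, r_y2, r_12 = 0.5, 0.0, 0.6
r2_x1_only = r_y1**2
r2_both = r_squared_two_predictors(r_y1, r_y2, r_12)

print(f"R^2 with X1 only:   {r2_x1_only:.3f}")
print(f"R^2 with X1 and X2: {r2_both:.3f}")
print(f"Admissible pattern: {is_admissible(r_y1, r_y2, r_12)}")
```

With these assumed values the squared multiple correlation rises when $X_2$ is added, although $X_2$ has no correlation with $Y$. Patterns for which the formula would return a value above 1 fail the admissibility check.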

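Cohen et al. (2003, p. 68) gave the formula for the $\beta_{Y1.2}$ and $\beta_{Y2.1}$ coefficients as follows: 

$$\beta_{Y1.2} = \frac{r_{Y1} - r_{Y2} r_{12}}{1 - r_{12}^2} \quad \text{and} \quad \beta_{Y2.1} = \frac{r_{Y2} - r_{Y1} r_{12}}{1 - r_{12}^2} \qquad (5)$$
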



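Figure 2. The inflation of $R^2_{Y.12}$ 

Since $r_{Y2} = 0$, Equation (5) can be simplified as 

$$\beta_{Y1.2} = \frac{r_{Y1}}{1 - r_{12}^2} \quad \text{and} \quad \beta_{Y2.1} = \frac{-r_{Y1} r_{12}}{1 - r_{12}^2} \qquad (6)$$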

The sign of $\beta_{Y2.1}$ depends on the sign of $r_{12}$. If there is a negative correlation 
between $X_1$ and $X_2$, the sign of $\beta_{Y2.1}$ will be the same as the sign of 
$\beta_{Y1.2}$. If there is a positive correlation between $X_1$ and $X_2$, the signs of 
$\beta_{Y2.1}$ and $\beta_{Y1.2}$ will be opposite, as can be seen in Figure 3. When 
$\beta_{Y2.1}$ has a positive sign, Krus and Wilkinson (1986) labelled it as ‘positive classical 
suppression’, and when $\beta_{Y2.1}$ has a negative sign they labelled it as ‘negative classical 
suppression’. The magnitude of the inflation of $\beta_{Y2.1}$ and $\beta_{Y1.2}$ from their 
bivariate correlations with the criterion, $r_{Y2}$ and $r_{Y1}$, also depends on the value of 
$r_{12}$. A higher value of $r_{12}$ leads to bigger inflations of $\beta_{Y1.2}$ and 
$\beta_{Y2.1}$, and beyond a certain point the values of $\beta_{Y1.2}$ and $\beta_{Y2.1}$ can 
exceed 1. 
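The sign pattern described above follows directly from Equation (5). The sketch below, written in Python with hypothetical correlation values and an illustrative function name, evaluates both standardised coefficients for a positive and a negative value of $r_{12}$ while holding $r_{Y2}$ at zero.

```python
def standardised_betas(r_y1, r_y2, r_12):
    """Standardised regression weights for two predictors, as in Equation (5)."""
    denom = 1 - r_12**2
    beta_y1_2 = (r_y1 - r_y2 * r_12) / denom
    beta_y2_1 = (r_y2 - r_y1 * r_12) / denom
    return beta_y1_2, beta_y2_1


# Hypothetical classical suppression: r_Y2 = 0, r_Y1 = 0.5.
for r_12 in (-0.6, 0.6):
    b1, b2 = standardised_betas(0.5, 0.0, r_12)
    same_sign = (b1 > 0) == (b2 > 0)
    print(f"r_12 = {r_12:+.1f}: beta_Y1.2 = {b1:.3f}, "
          f"beta_Y2.1 = {b2:.3f}, same sign = {same_sign}")
```

Under these assumed values the two weights share a sign when $r_{12}$ is negative and take opposite signs when $r_{12}$ is positive, and $\beta_{Y1.2}$ is larger in absolute value than $r_{Y1}$ in both cases.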


[Figure 3 plots $\beta_{Y1.2}$ (for $r_{12}$ = 0.0, ±0.3 and ±0.6) and $\beta_{Y2.1}$ (for $r_{12}$ = -0.6, -0.3, 0.0, 0.3 and 0.6) against the correlation between $X_1$ and $Y$ ($r_{Y1}$), over the range -1 to 1.] 

Figure 3. Classical suppression 

Net suppression 

This type of suppression occurs when a predictor variable has a regression weight with an 
opposite sign to its correlation with the criterion. In other words, $r_{Y1} \neq 0$, 
$r_{Y2} \neq 0$, and $r_{12} \neq 0$, but $\beta_{Y2.1}$ is opposite in sign to $r_{Y2}$. In order 
to understand the meaning of these coefficients it is useful to consider the Venn diagram shown 
in Figure 4. 




Figure 4. A Venn diagram for net suppression 

Here the primary function of $X_2$ is to suppress the error variance in $X_1$, rather than to 
influence $Y$ substantially. As can be seen in Figure 4, $X_2$ has much more in common with the 
error variance in $X_1$ than it does with the variance in $Y$. This can happen when $X_2$ is 
highly correlated with $X_1$ but weakly correlated with $Y$. 

In Figure 5 various $\beta_{Y2.1}$ values for $r_{12} = 0.6$ and $r_{12} = -0.6$ have been 
plotted. If $X_2$ is positively correlated with $Y$ but has a negative value of $\beta_{Y2.1}$, 
Krus and Wilkinson (1986) labelled it as ‘negative net suppression’. If $X_2$ is negatively 
correlated with $Y$ but has a positive value of $\beta_{Y2.1}$, Krus and Wilkinson (1986) called 
it ‘positive net suppression’. 
Cooperative suppression 

Cooperative suppression occurs when the two predictors are negatively correlated with each 
other, but both are positively or negatively correlated with $Y$. This is a case where each 
variable accounts for more of the variance in $Y$ when it is in an equation with the other than 
it does when it is presented alone. As can be seen in Figure 6, when $r_{12}$ is set to -0.6, the 
value of $R^2$ is more highly boosted as $r_{Y2}$ increases. When both $X_1$ and $X_2$ are 
positively correlated with $Y$, Krus and Wilkinson (1986) labelled it as ‘positive cooperative 
suppression’; and when both $X_1$ and $X_2$ are negatively correlated with $Y$, Krus and 
Wilkinson (1986) labelled it as ‘negative cooperative suppression’, as shown in Figure 7. 
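The boost in $R^2$ under cooperative suppression can be illustrated with the two-predictor formula for the squared multiple correlation. The sketch below is in Python and uses hypothetical values ($r_{Y1} = 0.3$, $r_{12} = -0.6$); it compares $R^2$ with the sum of the two squared bivariate correlations, which is what $R^2$ would be if the predictors were uncorrelated.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of Y on two predictors."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


# Hypothetical values: predictors negatively correlated with each other,
# both positively correlated with Y.
r_y1, r_12 = 0.3, -0.6
for r_y2 in (0.1, 0.3, 0.5):
    r2 = r_squared_two_predictors(r_y1, r_y2, r_12)
    uncorrelated_r2 = r_y1**2 + r_y2**2
    print(f"r_Y2 = {r_y2:.1f}: R^2 = {r2:.4f}, "
          f"sum of squared correlations = {uncorrelated_r2:.4f}, "
          f"boost = {r2 - uncorrelated_r2:.4f}")
```

For these assumed values the boost grows as $r_{Y2}$ increases, which is the pattern described for Figure 6.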


Figure 6. $R^2$ values in cooperative suppression ($r_{12} = -0.6$) 

Figure 7. Cooperative suppression ($r_{12} = -0.6$) 

Cohen and Cohen (1983) suggested that one indication of suppression is a standardised 
regression coefficient ($\beta_i$) that falls outside the interval $0 < \beta_i < r_{Yi}$. To 
paraphrase Cohen and Cohen (1983), if $X_i$ has a (near) zero correlation with $Y$, then there is 
possible classical suppression present. If its $\beta_i$ is opposite in sign to its correlation 
with $Y$, there is net suppression present. And if its $\beta_i$ exceeds $r_{Yi}$ and it has the 
same sign, there is cooperative suppression present. 
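The paraphrased rule can be written as a small decision function. The following Python sketch is only an illustration: the function names, the threshold `tol` used to decide what counts as a near-zero correlation, and the example correlations are all assumptions rather than values proposed by Cohen and Cohen (1983).

```python
def standardised_betas(r_y1, r_y2, r_12):
    """Standardised regression weights for two predictors."""
    denom = 1 - r_12**2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


def suppression_type(r_yi, beta_i, tol=0.05):
    """Classify one predictor from its correlation with Y and its beta.

    tol is a hypothetical cut-off for a '(near) zero' correlation.
    """
    if abs(r_yi) <= tol:
        return "classical" if abs(beta_i) > tol else "none"
    if r_yi * beta_i < 0:
        return "net"
    if abs(beta_i) > abs(r_yi):
        return "cooperative"
    return "none"


# Hypothetical correlation patterns (r_Y1, r_Y2, r_12); X2 is classified.
examples = {
    "zero r_Y2": (0.5, 0.0, 0.6),
    "weak r_Y2": (0.5, 0.2, 0.6),
    "negative r_12": (0.4, 0.4, -0.6),
    "uncorrelated predictors": (0.5, 0.4, 0.0),
}
for label, (r_y1, r_y2, r_12) in examples.items():
    _, beta_2 = standardised_betas(r_y1, r_y2, r_12)
    print(f"{label}: beta_Y2.1 = {beta_2:.3f} -> {suppression_type(r_y2, beta_2)}")
```

The rule is applied to one predictor at a time, so the two predictors in the same equation can receive different labels.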

Multicollinearity has adverse effects not only on the regression and the multiple correlation 
coefficients, but also on the standard errors of regression coefficients as well as on the 
accuracy of computations due to rounding errors. In order to detect such problems, the concepts 
of a ‘variance inflation factor’ (VIF) and ‘tolerance’ were introduced (Pedhazur, 1997; Cohen 
et al., 2003). 


$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2}$$

$$\mathrm{Tolerance}_i = \frac{1}{\mathrm{VIF}_i} = 1 - R_i^2$$
The smaller the tolerance or the higher the VIF, the greater are the problems arising from 
multicollinearity. There is no agreement on cut-off values of tolerance. BMDP uses a 
tolerance of 0.01 as a default cut-off for entering variables, while MINITAB and SPSS use a 
default value of 0.0001 (Pedhazur, 1997, p. 299). Cohen et al. (2003, p. 423) suggested that 
any VIF of 10 or more provides evidence of serious multicollinearity, which is equal to a 
tolerance of 0.1. Furthermore, they argued that “the values of the multicollinearity indices at 
which the interpretation of regression coefficients may become problematic will often be 
considerably smaller than traditional rule of thumb guidelines such as VIF = 10”. Sellin 
(1990) used the squared multiple correlation between a predictor and the set of remaining 
predictors involved in the equation ($R^2$) to indicate the relative amount of multicollinearity. 
He mentioned that relatively large values, typically those larger than 0.5, which is equal to 
VIF = 2, may cause problems in the estimation. 
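The correspondence between the cut-off values quoted above can be verified with a short calculation. The Python sketch below uses illustrative function names and simply converts a squared multiple correlation between one predictor and the remaining predictors into its VIF and tolerance.

```python
def tolerance(r_squared_i):
    """Tolerance of predictor i given its R^2 with the other predictors."""
    return 1 - r_squared_i


def vif(r_squared_i):
    """Variance inflation factor; undefined when R^2 equals 1."""
    if r_squared_i >= 1:
        raise ValueError("VIF is undefined when R^2 is 1 (perfect collinearity)")
    return 1 / (1 - r_squared_i)


# R^2 = 0.5 is the threshold attributed to Sellin (1990);
# R^2 = 0.9 corresponds to the VIF = 10 rule of thumb.
for r2 in (0.5, 0.9):
    print(f"R^2 = {r2:.1f}: VIF = {vif(r2):.2f}, tolerance = {tolerance(r2):.2f}")
```

A squared multiple correlation of 0.5 gives VIF = 2 and a tolerance of 0.5, while 0.9 gives VIF = 10 and a tolerance of 0.1, in agreement with the figures cited in the text.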

SOME ALTERNATIVE STRATEGIES 

When a researcher is concerned only with the prediction of Y, multicollinearity has little 
effect and no remedial action is needed (Cohen et al., 2003 p.425). However, if interest lies 
in the value of regression coefficients or in the notion of causation, multicollinearity may 
introduce a potentially serious problem. Pedhazur (1997) and Cohen et al. (2003) proposed 
some strategies to overcome this problem that included (a) model respecification, (b) 
collection of additional data, (c) using ridge regression, and (d) principal components 
regressions. 
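As an illustration of strategy (c), ridge regression for two standardised predictors can be written in terms of correlations: a constant $k$ is added to the diagonal of the predictor correlation matrix before it is inverted. The Python sketch below is a generic textbook form, not a procedure taken from Pedhazur (1997) or Cohen et al. (2003); the function name, the value of $k$ and the correlations are hypothetical.

```python
def ridge_betas(r_y1, r_y2, r_12, k=0.0):
    """Ridge estimates of two standardised weights from correlations.

    Solves (R_xx + k*I) b = r_xy for a 2x2 predictor correlation matrix.
    With k = 0 this reduces to the ordinary standardised weights.
    """
    a = 1 + k
    det = a * a - r_12**2
    b1 = (a * r_y1 - r_12 * r_y2) / det
    b2 = (a * r_y2 - r_12 * r_y1) / det
    return b1, b2


# Hypothetical correlations showing classical suppression.
r_y1, r_y2, r_12 = 0.5, 0.0, 0.6
for k in (0.0, 0.1, 0.5):
    b1, b2 = ridge_betas(r_y1, r_y2, r_12, k)
    print(f"k = {k:.1f}: beta_1 = {b1:.3f}, beta_2 = {b2:.3f}")
```

As $k$ increases, both weights shrink towards zero, which reduces the inflation caused by the correlation between the predictors at the cost of some bias.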

When two or more observed variables are highly correlated, it may be possible to create a 
latent variable, that can be used to represent a theoretical construct which cannot be observed 
directly. The latent construct is presumed to underlie those observed highly correlated 
variables (Byrne, 1994). 

The authors of this article have focused on this strategy, namely to create latent constructs 
and to extend the hierarchical linear model to accommodate them. The article also seeks to 
include indirect paths in the hierarchical linear model with the latent predictor. Thus, an 
attempt has been made to combine the strengths of the two common approaches in analysing 
multilevel data: (a) two-level path models that can estimate direct and indirect effects at two 
levels and can use latent constructs as predictor variables, but cannot estimate any cross-level 
interaction; and (b) hierarchical linear models that can estimate direct and cross-level 
interaction effects, but cannot estimate indirect paths or use latent constructs as predictor 
variables. Muthen and Muthen (2004) have developed a routine called ‘multilevel mixture 
modelling’ that can estimate a two-level model which has latent constructs as predictor 
variables, direct and indirect paths, as well as cross-level interactions. 

DATA AND VARIABLES 

The data used in this study were collected from 1,984 junior secondary students in 71 classes 
in 15 schools in Canberra, Australia. Information was collected about individual student 
socioeconomic status (father’s occupation), student aspirations (expected occupation), 
educational aspirations (expected education), academic motivation, attitude towards science 
(like science), attitude towards school in general (like school), self-regard, prior science 
achievement and final science achievement (outcome). In addition, information on class sizes 
was also collected. The outcome measure was the scores on a science achievement test of 55 
items. 

The names, codes and description of the predictor variables tested for inclusion at each level 
have been given in Table 1 . 
Table 1: Variables tested at each level of the hierarchy 

| Level | Variable code | Variable description |
|---|---|---|
| Level-1 (Student-level) | | |
| Student Background (N=1984) | FOCC | Father's occupation (1=Professional, ..., 6=Unskilled labourer) |
| | EXPOCC | Expected occupation (1=Professional, ..., 6=Unskilled labourer) |
| | EXPED | Expected education (1=Year 10 and below, ..., 6=Higher degree) |
| | ACAMOT | Academic motivation (0=Lowest motivation, ..., 40=Highest motivation) |
| | LIKSCH | Like school (0=Likes school least, ..., 34=Likes school most) |
| | LIKSCI | Like science (1=Likes science least, ..., 40=Likes science most) |
| | SELREG | Self regard (1=Lowest self regard, ..., 34=Highest self regard) |
| | ACH68 | Prior science achievement (0=Lowest score, ..., 25=Highest score) |
| Level-2 (Class-level) | | |
| Class Characteristics | CSIZE | Class size (8=Smallest, ..., 39=Largest) |
| Group Composition (n=71) | FOCC_2 | Average father occupation at class-level |
| | EXPOCC_2 | Average expected occupation at class-level |
| | EXPED_2 | Average expected education at class-level |
| | ACAMOT_2 | Average academic motivation at class-level |
| | LIKSCH_2 | Average like school at class-level |
| | LIKSCI_2 | Average like science at class-level |
| | SELREG_2 | Average self regard at class-level |
| | ACH68_2 | Average prior science achievement |
| Outcome | ACH69 | Science achievement (1=Lowest score, ..., 55=Highest score) |



HLM MODEL: THE INITIAL MODEL 

Initially a two-level model was fitted using HLM 6. The first step in the HLM analyses was 
to run a fully unconditional model in order to obtain the amounts of variance available to be 
explained at each level of the hierarchy (Bryk and Raudenbush, 1992). The fully 
unconditional model contained only the dependent variable (Science achievement, ACH) and 
no predictor variables were specified at the class level. The fully unconditional model is 
stated in equation form as follows. 

Level-1 model 

$$Y_{ij} = \beta_{0j} + e_{ij}$$

Level-2 model 

$$\beta_{0j} = \gamma_{00} + r_{0j} \qquad (10)$$

where: 

$Y_{ij}$ is the science achievement of student $i$ in class $j$; 
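The fully unconditional model splits the variance of the outcome into a within-class (Level-1) and a between-class (Level-2) component. As a minimal sketch in Python, the share of variance at each level can be computed from the two components; the function name is illustrative, and the values used are the HLM null-model components reported as ‘Variance Available’ in Table 4.

```python
def variance_shares(level1_variance, level2_variance):
    """Return the proportions of total variance at Level 1 and Level 2."""
    total = level1_variance + level2_variance
    return level1_variance / total, level2_variance / total


# HLM null-model variance components as reported in Table 4.
within_share, between_share = variance_shares(38.07, 33.85)
print(f"Within classes (Level 1):  {within_share:.1%}")
print(f"Between classes (Level 2): {between_share:.1%}")
```

The Level-2 share is often called the intraclass correlation; that label is not used in this article and is mentioned here only for orientation.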

The second step undertaken was to estimate a Level-1 model, that is, a model with 
student-level variables as the only predictors in Equation 10. This involved building up the 
student-level model or the so-called ‘unconditional’ model at Level-1 by adding student-level 
predictors to the model, but without entering predictors at the other level of the hierarchy. At 
this stage, a step-up approach was followed to examine which of the eight student-level 
variables (listed in Table 1) had a significant (at p<0.05) influence on the outcome variable, 
ACH69. Four variables (FOCC, EXPED, LIKSCI and ACH68) were found to be significant 
and therefore were included in the model at this stage. These four student-level variables 
were grand-mean-centred in the HLM analyses so that the intercept term would represent the 
ACH69 score for a student with average characteristics. 
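Grand-mean centring subtracts the overall sample mean of a predictor from every student's score, regardless of class membership. The Python sketch below shows the operation on a handful of hypothetical prior achievement scores; neither the scores nor the function name come from the study.

```python
def grand_mean_centre(values):
    """Subtract the overall mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical ACH68 scores for four students drawn from different classes.
ach68_scores = [10, 15, 20, 25]
centred = grand_mean_centre(ach68_scores)
print(centred)
```

After centring, a value of zero corresponds to a student at the overall mean, which is why the intercept can be read as the expected outcome for a student with average characteristics.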

The final step undertaken was to estimate a Level-2 model, which involved adding the 
Level-2 or class-level predictors into the model using the step-up strategy mentioned above. At 
this stage, the Level-2 exploratory analysis sub-routine available in HLM 6 was employed for 
examining the potentially significant Level-2 predictors in successive HLM runs. Following 
the step-up procedure, two class-level variables (CSIZE and ACH68_2) were included in the 
model for the intercept. In addition, one cross-level interaction effect between ACH68 and 
CSIZE was included in the model. 

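The final model at Levels 1 and 2 can be denoted as follows. 

Level-1 Model 

$$Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{FOCC}) + \beta_{2j}(\mathrm{EXPED}) + \beta_{3j}(\mathrm{LIKSCI}) + \beta_{4j}(\mathrm{ACH68}) + r_{ij}$$

Level-2 Model 

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{ACH68\_2}) + \gamma_{02}(\mathrm{CSIZE}) + u_{0j}$$

$$\beta_{1j} = \gamma_{10} + u_{1j}$$

$$\beta_{2j} = \gamma_{20} + u_{2j}$$

$$\beta_{3j} = \gamma_{30} + u_{3j}$$

$$\beta_{4j} = \gamma_{40} + \gamma_{41}(\mathrm{CSIZE}) + u_{4j} \qquad (11)$$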

The next step was to re-estimate the final model using the MPLUS program. The results of 
the estimates of fixed effects from the two-level model are given in Table 2 for HLM and 
MPLUS estimation. 


RESULTS 

At the student-level, from the results in Table 2 it can be seen that Science achievement was 
directly influenced by Father's occupation (FOCC), Expected education (EXPED), Like 
science (LIKSCI) and Prior achievement (ACH68). When other factors were equal, students 
whose fathers had high status occupations (e.g. medical doctors and lawyers) outperformed 
students whose fathers had low status occupations (e.g. labourers and cleaners). Students who 
aspired to pursue education to high levels were estimated to achieve better when compared to 
students who had no such ambitions, while students who liked science were estimated to 
achieve better when compared to students who did not like science. In addition, students who 
had high prior achievement scores were estimated to achieve better than students who had 
low prior achievement scores. 

At the class-level, from the results in Table 2 it can be seen that Science achievement was 
directly influenced by Average prior achievement (ACH68_2) and Class size (CSIZE). When 
other factors were equal, students in classes with high prior achievement scores were likely to 
achieve better when compared to students in classes with low prior achievement scores. 
Importantly, there was considerable advantage (in terms of better achievement in science) 
associated with being in larger classes. These relationships have been shown in Figure 8. 



Darmawan and Keeves 




From the results in Table 2 it can also be seen that there is one significant cross-level 
interaction effect between ACH68 and CSIZE. This interaction is presented in Figure 9. 
Nevertheless, in interpreting the effects of class size, it should be noted that 10 out of the 
15 schools in these data had a streaming policy that involved placing high achieving students in 
larger classes and low achieving students in smaller classes for effective teaching. Therefore, 
the better performance of the students in larger classes in these data was not surprising. 
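The cross-level interaction means that the slope of prior achievement is not constant but changes with class size, following the Level-2 equation for $\beta_{4j}$. The Python sketch below uses the HLM estimates from Table 2 ($\gamma_{40} = 0.91$, $\gamma_{41} = 0.013$) and ignores the random term $u_{4j}$. It assumes that $\gamma_{40}$ is the slope at the reference point of CSIZE; whether CSIZE was centred is not stated, so the class size is expressed as a deviation from that reference point.

```python
GAMMA_40 = 0.91   # HLM estimate: effect of ACH68 at the CSIZE reference point
GAMMA_41 = 0.013  # HLM estimate: change in the ACH68 effect per extra student


def ach68_slope(csize_deviation, gamma_40=GAMMA_40, gamma_41=GAMMA_41):
    """Expected slope of ACH68 for a class csize_deviation students
    above (positive) or below (negative) the CSIZE reference point."""
    return gamma_40 + gamma_41 * csize_deviation


for deviation in (-10, 0, 10):
    print(f"CSIZE deviation {deviation:+d}: ACH68 slope = {ach68_slope(deviation):.2f}")
```

The difference in slope between two classes depends only on $\gamma_{41}$ and the difference in their sizes, so a class with 20 more students has an expected ACH68 slope that is 0.26 larger, whatever reference point was used.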


Table 2. HLM and MPLUS results for initial model 

| Level 1 (N=1984) | Level 2 (n=71) | HLM Estimate (se) | MPLUS Estimate (se) |
|---|---|---|---|
| Intercept | | 28.37 (0.20) | 28.87 (0.19) |
| | ACH68_2 | 0.78 (0.10) | 0.76 (0.12) |
| | CSIZE | 0.16 (0.04) | 0.16 (0.04) |
| FOCC | | -0.25 (0.09) | -0.24 (0.10) |
| EXPED | | 0.48 (0.09) | 0.49 (0.09) |
| LIKSCI | | 0.15 (0.01) | 0.15 (0.01) |
| ACH68 | | 0.91 (0.04) | 0.93 (0.04) |
| | CSIZE | 0.013 (0.005) | 0.015 (0.006) |




Figure 8. Model 1: Initial Model (MPlus results used) 


ALTERNATIVE MODELS 

Two alternative models, Model 2 and Model 3, were estimated using MPLUS 3.13. Both 
EXPED and EXPOCC are significantly correlated with ACH69 with correlation coefficients 
of 0.50 and 0.35 respectively. Either EXPED or EXPOCC can have a significant effect on 
ACH69. However, if the two variables were put together as predictors of ACH69, only 
EXPED was found to be significant. Since there is a relatively high correlation between 
EXPOCC and EXPED (-0.53) it is possible to form a latent construct, labelled as aspiration 
(ASP), and use this construct as a predictor variable instead of just using either EXPOCC or 
EXPED. In this way, both variables (EXPOCC and EXPED) become significant reflectors of 
aspiration. Otherwise, EXPOCC may be regarded as an insignificant predictor of science 
achievement as in the initial model. The results have been recorded in Table 3 and Model 2 is 
shown visually in Figure 10. This employment of a latent construct is very useful in 
situations where three observed predictor variables are available and suppressor relationships 
occur if all three predictor variables are introduced separately into the regression equation. 



Figure 9. Interaction effect between CSIZE and PRIORACH 



Figure 10. Model 2: With latent construct 

The next step undertaken was to estimate another model with two additional indirect paths. It 
was hypothesised that academic motivation (ACAMOT) influenced like science (LIKSCI) at the 
student level and that average father’s occupational status influenced average prior achievement 
at the class level. The results are recorded in Table 3 and Model 3 is shown in Figure 11. 
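A common way to quantify an indirect effect in a path model is the product of the coefficients along the path. The article does not report indirect effects computed in this way, so the Python sketch below is only an illustration of that convention, applied to the unstandardised Model 3 estimates in Table 3; the function name is an assumption.

```python
def indirect_effect(path_a, path_b):
    """Product-of-coefficients indirect effect for a two-step path."""
    return path_a * path_b


# Student level: ACAMOT -> LIKSCI (0.56), then LIKSCI -> ACH69 (0.15).
student_level = indirect_effect(0.56, 0.15)

# Class level: FOCC_2 -> ACH68_2 (-2.56), then ACH68_2 -> ACH69 (0.77).
class_level = indirect_effect(-2.56, 0.77)

print(f"ACAMOT -> LIKSCI -> ACH69:   {student_level:.3f}")
print(f"FOCC_2 -> ACH68_2 -> ACH69:  {class_level:.3f}")
```

The negative class-level product reflects the coding of father's occupation, where lower codes denote higher status occupations.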

The proportions of variance explained at each level for each model are presented in Table 4. 
For Model 1, the initial model, 45 per cent of the variance available at Level 1 and almost all 
(95 per cent) of the variance available at Level 2 have been explained by the inclusion of four 
variables at Level 1 (FOCC, EXPED, LIKSCI, and ACH68) and two variables at Level 2 (ACH68_2 
and CSIZE) as well as one interaction effect between ACH68 and CSIZE. Overall this model 
explained 68.7 per cent of the total variance available when the model was estimated with HLM. 
MPLUS estimations are very close to HLM estimations. Adding a latent construct into the 
model did not really increase the amount of variance explained, but it did give a more 
coherent picture of the relationships. This is also true for Model 3 when indirect paths are 
added. 
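The percentages quoted above follow from comparing each model's residual variance with the variance available in the null model. The Python sketch below reproduces the HLM figures for Model 1 from the variance components in Table 4; the function name is illustrative.

```python
def percent_explained(available, residual):
    """Percentage of the available variance explained by a model."""
    return 100 * (available - residual) / available


# HLM variance components from Table 4: (variance available, Model 1 residual).
components = {
    "Level 1": (38.07, 20.93),
    "Level 2": (33.85, 1.60),
    "Total": (71.92, 22.53),
}
for level, (available, residual) in components.items():
    print(f"{level}: {percent_explained(available, residual):.1f} per cent explained")
```

The same calculation applied to the MPLUS columns gives the corresponding MPLUS percentages for Models 1 to 3.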
Table 3. Model 2 and Model 3 Results 

| Level 1 (N=1984), criterion ACH69 | Level 2 (n=71) | Model 2: with latent construct, estimate (se) | Model 3: with latent construct and indirect paths, estimate (se) |
|---|---|---|---|
| Latent Construct: ASP by | | | |
| EXPED | | 1.00 (0.00) | 1.00 (0.00) |
| EXPOCC | | -0.63 (0.10) | -0.63 (0.11) |
| Indirect Paths | | | |
| ACAMOT on LIKSCI | | | 0.56 (0.03) |
| | FOCC_2 on ACH68_2 | | -2.56 (0.30) |
| Fixed Effects | | | |
| Intercept | | 28.87 (0.20) | 28.85 (0.20) |
| | ACH68_2 | 0.75 (0.12) | 0.77 (0.12) |
| | CSIZE | 0.17 (0.05) | 0.17 (0.05) |
| FOCC | | -0.23 (0.10) | -0.22 (0.10) |
| ASP | | 0.62 (0.14) | 0.61 (0.14) |
| LIKSCI | | 0.15 (0.01) | 0.15 (0.01) |
| ACH68 | | 0.93 (0.04) | 0.93 (0.04) |
| | CSIZE | 0.014 (0.007) | 0.015 (0.01) |



CONCLUSIONS 

Multicollinearity is one of the problems that need to be examined carefully when a multiple 
regression model is employed. When the main concern is merely the prediction of Y, 
multicollinearity generally has little effect, but if the main interest lies in the value of 
regression coefficients, multicollinearity may introduce a potentially serious problem. 

Multilevel mixture modelling, which can estimate a two-level model that has latent 
constructs as predictor variables, direct and indirect paths, as well as cross-level interactions, 
has been used as an alternative strategy to analyse multilevel data. In a sense, this approach 
can be seen as an attempt to combine the strengths of the two commonly used techniques in 
analysing multilevel data, namely two-level path modelling and hierarchical linear modelling. 

The initial model was a hierarchical linear model, which was fitted using both HLM 6 and 
MPLUS 3.13. Both estimations yielded similar results. The main effects reported from the 
analysis at the student level indicate that, in addition to prior achievement, it was the social 
psychological measures associated with the differences between students within classrooms 
that were having effects, namely, socioeconomic status, educational aspirations, and attitudes 
towards learning science. About 55 per cent of the variance between students within 
classrooms was left unexplained, indicating that there were other student-level factors likely 
to be involved in influencing student achievement. 


Table 4. Variance components 

| Model (N=1984, n=71) | HLM Level 1 | HLM Level 2 | HLM Total | MPLUS Level 1 | MPLUS Level 2 | MPLUS Total |
|---|---|---|---|---|---|---|
| Null Model | | | | | | |
| Variance Available | 38.07 | 33.85 | 71.92 | 38.07 | 33.35 | 71.42 |
| Initial Achievement (Residual) | 24.25 | 9.34 | 33.59 | 24.33 | 8.45 | 32.78 |
| Total Variance Explained % | 36.3 | 72.4 | 53.3 | 36.1 | 74.7 | 54.1 |
| Total Variance Unexplained % | 63.7 | 27.6 | 46.7 | 63.0 | 25.3 | 45.9 |
| Model 1: Initial Model (Residual) | 20.93 | 1.60 | 22.53 | 21.01 | 1.46 | 22.46 |
| Total Variance Explained % | 45.0 | 95.3 | 68.7 | 44.8 | 95.6 | 68.6 |
| Total Variance Unexplained % | 55.0 | 4.7 | 31.3 | 55.2 | 4.4 | 31.4 |
| Model 2: With Latent Predictor (Residual) | | | | 21.36 | 1.49 | 22.84 |
| Total Variance Explained % | | | | 43.9 | 95.5 | 68.0 |
| Total Variance Unexplained % | | | | 56.1 | 4.5 | 32.0 |
| Model 3: Add Indirect Paths (Residual) | | | | 21.36 | 1.50 | 22.85 |
| Total Variance Explained % | | | | 43.9 | 95.5 | 68.0 |
| Total Variance Unexplained % | | | | 56.1 | 4.5 | 32.0 |



At the classroom level, about 4.7 per cent of the variance between classes was left 
unexplained, and the average level of prior achievement of the class group had a significant 
effect. In addition, class size had a positive effect on science achievement, with students in 
larger classes doing significantly better than students in smaller classes. Perhaps this 
indicates the confounding effect of the streaming policy adopted by some schools to place better 
students in larger classes. In addition, the interaction effect reveals that the effect of prior 
achievement is stronger in larger classes. High achieving students are better off in larger 
classes. 

The next step was to add a latent construct, aspiration, to the initial model. This model was 
estimated using the two-level mixture model procedure in MPLUS 3.13. By 
creating this latent construct, it could be said that aspiration, which was reflected 
significantly by expected education and expected occupation, had a positive effect on 
achievement. 

The last step was to add two indirect paths, one at the student level and one at the class level. 
At the student level, academic motivation was found to have a significant effect on like 
science and indirectly influence achievement through like science. At the class level, average 
fathers’ occupation was related to average prior achievement. 

By using multilevel mixture modelling, the limitations of hierarchical linear modelling are 
partly reduced. The ability to include latent constructs in a path model reduces the problem of 
multicollinearity and multiple measures. The inclusion of indirect paths also increases the 
modelling possibilities. However, these estimations need greater computing power if larger 
models are to be examined. 